Query Log Mining in Search Engines

Authors

  • Marcelo Gabriel Mendoza Rocha
  • Carlos Hurtado
  • Mark Levene
Abstract

The Web is a huge read-write information space where many items such as documents, images, and other multimedia can be accessed. In this context, several information technologies have been developed to help users satisfy their search needs on the Web, and the most widely used are search engines. Search engines allow users to find Web resources by formulating queries (sets of terms) and reviewing a list of answers. One of the most challenging goals for the Web community is to design search engines that allow users to find resources semantically connected to their queries. The huge size of the Web and the vagueness of the terms most commonly used to formulate queries still pose a major obstacle to achieving this goal. In this thesis we propose to explore the user clicks registered in search engine logs in order to learn how users search and to design algorithms that could improve the precision of the answers suggested to users. We start by exploring the properties of the users' click data. This exploration allows us to determine the sparse nature of the data and provides user behavior models that help us understand how users search in search engines. Second, we will explore the users' click data to find useful associations among queries registered in the logs. We will focus on the design of techniques that allow users to find better queries than the original query. As an application, we will design query reformulation methods that help users find more useful terms to represent their needs. Using document terms, we will build vector representations for queries. By applying clustering techniques we are able to compute clusters of similar queries. Using query clusters, we provide techniques for document and query suggestion that allow us to improve the precision of the answer lists. Finally, we will design query classification techniques that allow us to find concepts semantically related to the original query. To do this, we classify the users' queries into a Web directory. As an application, we provide methods for the maintenance of the directory.
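The idea of representing queries by the terms of their clicked documents and then clustering them can be illustrated with a minimal sketch. The abstract does not state the term weighting or the clustering algorithm used in the thesis, so everything below is an assumption made for illustration: click-weighted raw term counts, cosine similarity, a greedy single-pass clustering with a hypothetical threshold of 0.5, and a made-up toy log.

```python
import math
from collections import Counter


def query_vector(clicked_docs):
    """Build a term vector for one query from its clicked documents.

    clicked_docs is a list of (document_text, click_count) pairs.
    Weighting terms by click count is an assumption, not the thesis's scheme.
    """
    vec = Counter()
    for text, clicks in clicked_docs:
        for term in text.lower().split():
            vec[term] += clicks
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_queries(vectors, threshold=0.5):
    """Greedy single-pass clustering (a stand-in for the unspecified method).

    Each query joins the first cluster whose seed query is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_query, member_queries)
    for query, vec in vectors.items():
        for seed, members in clusters:
            if cosine(vec, vectors[seed]) >= threshold:
                members.append(query)
                break
        else:
            clusters.append((query, [query]))
    return [members for _, members in clusters]


# Hypothetical toy click log: query -> [(clicked document text, clicks)]
log = {
    "jaguar car": [("jaguar luxury car dealer", 3), ("car prices jaguar", 1)],
    "jaguar price": [("car prices jaguar", 2)],
    "jaguar animal": [("jaguar big cat habitat", 4)],
}

vectors = {q: query_vector(docs) for q, docs in log.items()}
clusters = cluster_queries(vectors)
print(clusters)
```

In this toy log the two car-related queries share most of their clicked-document terms and fall into one cluster, while "jaguar animal" overlaps only on the ambiguous term "jaguar" and forms its own cluster. Queries in the same cluster are natural candidates for query suggestion.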

Similar resources

Discovering Popular Clicks' Pattern of Teen Users for Query Recommendation

Search engines are still the most important gateways for information search on the Internet. In this regard, providing the best response to the user's request in the shortest possible time is still desired. Normally, search engines are designed for adults, and few policies have been employed that consider teen users. Teen users are more biased in clicking the results list than are adult users. This leads...


Query expansion based on relevance feedback and latent semantic analysis

Web search engines are one of the most popular tools on the Internet which are widely-used by expert and novice users. Constructing an adequate query which represents the best specification of users’ information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...


Privacy in Web Search Query Log Mining

Web search engines have changed our lives enabling instant access to information about subjects that are both deeply important to us, as well as passing whims. The search engines that provide answers to our search queries also log those queries, in order to improve their algorithms. Academic research on search queries has shown that they can provide valuable information on diverse topics includ...


Personalized Concept and Fuzzy Based Clustering of Search Engine Queries

Personalized search is an important research area that aims to resolve the ambiguity of query terms. Since queries submitted to search engines tend to be short and ambiguous, they are not likely to be able to express the user’s precise needs. To alleviate this problem, some search engines suggest terms that are semantically related to the submitted queries so that users can choose from the sugg...


Mining Query Logs

Web Search Engines (WSEs) have stored information about users in their query logs since they started to operate. This information often serves many purposes. The primary focus of this tutorial is to introduce the discipline of query log mining. We will show its foundations, by giving a unified view on the literature on query log analysis, and also present in detail the basic algorithms and t...


Mining related queries from Web search engine query logs using an improved association rule mining model

Finding relevant information on a given topic on the Web is becoming increasingly difficult. Web search engines have hence become one of the most popular solutions available on the Web. However, it has never been easy for novice users to organize and represent their information needs using simple queries. Users have to keep modifying their input queries until they get the expected results. Therefore, it...


Publication date: 2007